Exploiting Audio-visual Correlation in Coding of Talking Head Sequences

Authors

Ram R. Rao

Tsuhan Chen

Abstract

Ram R. Rao, Georgia Institute of Technology, Atlanta, GA 30332; Tsuhan Chen, AT&T Bell Laboratories, Holmdel, NJ 07733.

In this paper, we present a novel means for predicting the shape of a person's mouth from the corresponding speech signal and explore applications of this prediction to video coding. One possible application is cross-modal predictive coding. In the cross-modal predictive coding system described in this paper, a model-based video coder compares measured visual parameters with predicted visual parameters, and sends the difference between the two to the receiver. Since the decoder also receives the acoustic data, it can form the prediction and then reconstruct the original parameters by adding the transmitted error signal.
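The encoder/decoder loop described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration, not the authors' method: `predict_mouth_params` is a hypothetical placeholder linear map standing in for the paper's audio-to-visual predictor, the two acoustic features and two mouth parameters are invented, and the uniform quantizer with step size `step` is added only to show how a residual might be sent compactly.

```python
from typing import List, Sequence


def predict_mouth_params(acoustic: Sequence[float]) -> List[float]:
    """Hypothetical audio-to-visual predictor (placeholder linear map).

    Assumption: maps two acoustic features to [mouth_width, mouth_height].
    The actual predictor in the paper is not specified in the abstract.
    """
    energy, pitch = acoustic
    return [0.5 * energy + 0.1 * pitch, 0.8 * energy - 0.05 * pitch]


def encode(measured: Sequence[float], acoustic: Sequence[float],
           step: float = 0.01) -> List[int]:
    """Encoder: quantize the difference between measured and predicted parameters."""
    predicted = predict_mouth_params(acoustic)
    return [round((m - p) / step) for m, p in zip(measured, predicted)]


def decode(residual_idx: Sequence[int], acoustic: Sequence[float],
           step: float = 0.01) -> List[float]:
    """Decoder: rebuild the prediction from the audio and add the residual."""
    predicted = predict_mouth_params(acoustic)
    return [p + q * step for p, q in zip(predicted, residual_idx)]


if __name__ == "__main__":
    measured = [0.62, 0.71]   # illustrative measured mouth parameters
    acoustic = [1.0, 0.5]     # illustrative acoustic features for the same frame
    residual = encode(measured, acoustic)
    print(residual, decode(residual, acoustic))
```

Because both sides compute the same prediction from the shared audio, only the residual needs to be transmitted; in this sketch the reconstruction differs from the measured parameters by at most half a quantization step.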

Similar resources

A Framework for Data-driven Video-realistic Audio-visual Speech-synthesis

In this work, we present a framework for generating a video-realistic audio-visual “Talking Head”, which can be integrated in applications as a natural Human-Computer interface where audio only is not an appropriate output channel especially in noisy environments. Our work is based on a 2D-video-frame concatenative visual synthesis and a unit-selection based Text-to-Speech system. In order to ...

Data-Driven Tools for Designing Talking Heads Exploiting Emotional Attitudes

Audio/visual speech, in the form of labial movement and facial expression data, was utilized in order to semi-automatically build a new Italian expressive and emotive talking head capable of believable and emotional behavior. The methodology, the procedures and the specific software tools utilized for this scope will be described together with some implementation examples.

Audio-visual speech asynchrony modeling in a talking head

An audio-visual speech synthesis system with modeling of asynchrony between auditory and visual speech modalities is proposed in the paper. Corpus-based study of real recordings gave us the required data for understanding the problem of modalities asynchrony that is partially caused by the coarticulation phenomena. A set of context-dependent timing rules and recommendations was elaborated in or...

Text Driven 3D Photo-Realistic Talking Head

We propose a new 3D photo-realistic talking head with a personalized, photo realistic appearance. Different head motions and facial expressions can be freely controlled and rendered. It extends our prior, high-quality, 2D photo-realistic talking head to 3D. Around 20-minutes of audio-visual 2D video are first recorded with read prompted sentences spoken by a speaker. We use a 2D-to-3D reconstru...

The Development of a Brazilian Talking Head

This paper describes partial results of a research, in progress at the School of Electrical and Computer Engineering of the State University of Campinas, aimed at developing a realistic three-dimensional Brazilian Talking Head. Through an extensive analysis of a video-audio linguistic corpus, a set of 29 phonetic context-dependent visemes (22 consonantal plus 7 vocalic visemes), that accommodat...

Publication date: 1996
